Statistical machine translation decoder based on phrase
Authors
Abstract
This paper describes a decoding algorithm for phrase-based statistical machine translation. In the past, solutions to the decoding problem were inspired by those of speech recognizers: each input word was translated into one or more output words, and the output was generated in left-to-right order. The algorithm presented here instead iteratively constructs phrases, or chunks of cepts, until all the input words are consumed. This behavior results in higher computational complexity than that of decoders with left-to-right constraints, although Japanese-to-English translation experiments show better translation accuracy.
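The idea of consuming input words phrase by phrase can be illustrated with a toy sketch. This is not the paper's algorithm: the function `decode`, the phrase table, its probabilities, and the `fluent_bigrams` set (a crude stand-in for a language model) are all assumptions invented for illustration. The sketch repeatedly picks any uncovered contiguous source span that has a phrase translation, appends that translation to the output, and stops when every input word is covered.

```python
import math

def decode(source, phrase_table, fluent_bigrams):
    """Toy exhaustive search over phrase choices (illustrative only)."""
    n = len(source)
    best_score, best_output = -math.inf, None
    # Each state: (covered source positions, output words so far, score).
    stack = [(frozenset(), (), 0.0)]
    while stack:
        covered, output, score = stack.pop()
        if len(covered) == n:  # all input words consumed
            if score > best_score:
                best_score, best_output = score, output
            continue
        for i in range(n):
            if i in covered:
                continue
            for j in range(i + 1, n + 1):
                if j - 1 in covered:
                    break  # longer spans starting at i would overlap too
                for target, logp in phrase_table.get(tuple(source[i:j]), ()):
                    step = logp
                    # Hypothetical fluency penalty at the junction between
                    # the previous output word and the new phrase.
                    if output and (output[-1], target[0]) not in fluent_bigrams:
                        step -= 1.0
                    stack.append((covered | set(range(i, j)),
                                  output + target,
                                  score + step))
    if best_output is None:
        return None, best_score
    return list(best_output), best_score

# Made-up Japanese-to-English example data.
source = ["watashi", "wa", "ringo", "o", "taberu"]
phrase_table = {
    ("watashi", "wa"): [(("i",), math.log(0.9))],
    ("ringo", "o"): [(("an", "apple"), math.log(0.8))],
    ("taberu",): [(("eat",), math.log(0.9))],
}
fluent_bigrams = {("i", "eat"), ("eat", "an")}

output, score = decode(source, phrase_table, fluent_bigrams)
print(" ".join(output))  # prints "i eat an apple"
```

Because source phrases may be picked in any order, the search explores every ordering of the phrases, which hints at why dropping the left-to-right constraint raises computational cost. A practical decoder would prune this space, for example with beam search.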
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Translating Phrases in Neural Machine Translation
Phrases play an important role in natural language understanding and machine translation (Sag et al., 2002; Villavicencio et al., 2005). However, it is difficult to integrate them into current neural machine translation (NMT) which reads and generates sentences word by word. In this work, we propose a method to translate phrases in NMT by integrating a phrase memory storing target phrases from ...
Phrase Segmentation Model using Collocation and Translational Entropy
In this paper, we propose a phrase segmentation model for the phrase-based statistical machine translation. We observed that good translation candidates generated by a conventional phrase-based SMT decoder have lexical cohesion and show more uniform translation for each phrase segment. Based on the observation, we propose a novel phrase segmentation model using collocation between two adjacent ...
An Elastic-Phrase Model for Statistical Machine Translation
We present some on-going research on phrase-based Statistical Machine Translation using flexible phrases that may contain gaps of variable lengths. This allows us to naturally handle various linguistic phenomena such as negations or separable particles. We integrate this within the standard Maximum Entropy model using some dedicated feature functions, and describe a beam-search stack decoder th...
NP Subject Detection in Verb-Initial Arabic Clauses
Phrase re-ordering is a well-known obstacle to robust machine translation for language pairs with significantly different word orderings. For Arabic-English, two languages that usually differ in the ordering of subject and verb, the subject and its modifiers must be accurately moved to produce a grammatical translation. This operation requires more than base phrase chunking and often defies cur...
Cohesive Phrase-Based Decoding for Statistical Machine Translation
Phrase-based decoding produces state-of-the-art translations with no regard for syntax. We add syntax to this process with a cohesion constraint based on a dependency tree for the source sentence. The constraint allows the decoder to employ arbitrary, non-syntactic phrases, but ensures that those phrases are translated in an order that respects the source tree’s structure. In this way, we target...